Statistics of multiple PWM occurrences in biopolymer sequences

Authors

  • Valentina A. Boeva
  • Mireille Regnier
  • Mikhail A. Roytberg
  • Vsevolod J. Makeev

Similar resources

On counting position weight matrix matches in a sequence, with application to discriminative motif finding

MOTIVATION AND RESULTS: The position weight matrix (PWM) is a popular method to model transcription factor binding sites. A fundamental problem in cis-regulatory analysis is to "count" the occurrences of a PWM in a DNA sequence. We propose a novel probabilistic score to solve this problem of counting PWM occurrences. The proposed score has two important properties: (1) It gives appropriate weigh...

A generalization of Profile Hidden Markov Model (PHMM) using one-by-one dependency between sequences

The Profile Hidden Markov Model (PHMM) can be poor at capturing dependency between observations because of the statistical assumptions it makes. To overcome this limitation, the dependency between residues in a multiple sequence alignment (MSA) which is the representative of a PHMM can be combined with the PHMM. Based on the fact that sequences appearing in the final MSA are written based on th...

Fitting a Mixture Model By Expectation Maximization To Discover Motifs In Biopolymer

The algorithm described in this paper discovers one or more motifs in a collection of DNA or protein sequences by using the technique of expectation maximization to fit a two-component finite mixture model to the set of sequences. Multiple motifs are found by fitting a mixture model to the data, probabilistically erasing the occurrences of the motif thus found, and repeating the process to find...

Approximation of word counts in Markov chains

In this talk, we give an overview about the different approximation results existing on the statistical distribution of word counts in a Markov chain. Results concerning the number of overlapping occurrences, the number of non-overlapping occurrences (renewals) and the declumped count will be presented. Counts of single words but also multiple words and word families are considered. We will see ...

Overview of the PWMEnrich package

The main functionality of the package is Position Weight Matrix (PWM) enrichment analysis in a single sequence (e.g. enhancer of interest) or a set of sequences (e.g. set of ChIP-chip/seq peaks). Note that this is not the same as de-novo motif finding which discovers novel motifs, nor motif comparison which aligns motifs. The package is built upon Biostrings and offers high-level functions to s...

Publication date: 2007